Abstract. The Border algorithm and the iPred algorithm find the Hasse
diagrams of FCA lattices. We show that they can be generalized to
arbitrary lattices. In the case of iPred, this requires the identification of a
join-semilattice homomorphism into a distributive lattice.

1 Introduction

Lattices are mathematical structures with many applications in computer sci- 
ence; among these, we are interested in fields like data mining, machine learning, 
or knowledge discovery in databases. One classical use of lattice theory is in for- 
mal concept analysis (FCA) [B] , where the concept lattice with its diagram graph 
allows for the visualization and summarization of data in a more concise repre- 
sentation. In the Data Mining community, the same mathematical notions (often 
under additional "frequency" constraints that bound from below the size of the 
support set) are studied under the banner of Closed-Set Mining (see e.g. pTj). 

In these applications, data consists of transactions, also called objects, each 
of which, besides having received a unique identifier, consists of a set of items or 
attributes taken from a previously agreed finite set. A concept is a pair formed 
by a set of transactions — the extent set or support set of the concept — and a 
set of attributes — the intent set of the concept — defined as the set of all those 
attributes that are shared by all the transactions present in the extent. Some 
data analysis processes are based on the family of all intents (the "closures" 
stemming from the dataset); but others require to determine also their order 
relation, which is a finite lattice, in the form of a line graph (the Hasse diagram). 

Existing algorithms can be divided into three main types: the ones that only
generate the set of concepts, the ones that first generate the set of concepts
and then construct the Hasse diagram, and the ones that construct the diagram
while computing the lattice elements (see [3T], and also [9112] and the references
therein). The goal is to obtain the concept lattice in time linear in the number
of concepts, because this number is, most of the time, already exponential in
the number of attributes, which makes algorithms that are polynomial in
the size of the input essentially impossible.

One widespread use of concepts or closures is the generation of implications 
or of partial implications (also called association rules). Several data mining al- 
gorithms aim at processing large datasets in time linear in the size of the closure 
space, and explore closed sets individually; these solutions tend to drown the 
user under a deluge of partial implications. More sophisticated works attempt
to provide selected "bases" of partial implications; the early proposal in [13]
requires computing immediate predecessors, that is, the Hasse diagram. Alternative
proposals such as the Essential Rules of [1] or the equivalent Representative
Rules of (of which a detailed discussion with new characterizations and an 
alternative basis proposal appears in [5J) require to process predecessors of closed 
sets obeying tightly certain support inequalities; these algorithms also benefit 
from the Hasse diagram, as the slow alternatives are blind repeated traversal of 
the closed sets in time quadratic in the size of the closure space, or storage of 
all predecessors of each closed set, which soon becomes large enough to impose 
a considerable penalty on the running times. 

The problem of constructing the Hasse diagram of an arbitrary finite lattice is 
less studied. One algorithm that has a better worst case complexity than various 
previous works is described in [16]. From our "arbitrary lattices" perspective,
its main drawback is that it requires the availability of a basis from which each 
element of the lattice can be derived. In the absence of such a subset, one may 
still use this algorithm (at a greater computational cost) to output the Dedekind- 
MacNeille completion [7] of the given lattice, which in our case is isomorphic to 
the lattice itself. The algorithm is also easily adaptable to concept lattices, where 
indeed a basis is available immediately from the dataset transactions. 

We consider it of interest to have further, faster algorithms available for
arbitrary finite lattices; we have two reasons for this aim. First, many (although not
all) algorithms constructing Hasse diagrams traverse concepts in layers defined
by the size of the intents; our explorations of association rules sometimes
require following different orderings, so that a more abstract approach is helpful;
second, we keep in mind the application area corresponding to certain variants 
of implications and database dependencies that are characterized by lattices of 
equivalence relations, so that we are interested in laying a strong foundation 
that gives us a clear picture of the applicability requirements for each algorithm 
constructing Hasse diagrams in lattices other than powerset sublattices. 

Of course, we expect that FCA-oriented algorithms could be a good source
of inspiration for the design of algorithms applicable in the general case. An
example showing that such an extension is possible is the algorithm in [20] (see
Section 3 for more details), whose highest-level description matches the general case
of arbitrary lattices; nevertheless, the actual implementation described in [20]
works strictly for formal concept lattices, so that further implementations and
complexity analyses are not readily available for arbitrary finite lattices.

The contribution of the present paper supports the same idea: we show how
two existing algorithms that build the Hasse diagrams of a concept lattice can
be adapted to work for arbitrary lattices. Both algorithms have in common the
notion of border, which we (re-)define and formalize in Section 3 after presenting
some preliminary notions about lattice theory in Section 2; our approach has the
specific interest that the notion of border is given just in terms of the ordering
relation, and not in terms of a set of elements already processed as in previous
references ([5,14,20]); yet, the notions are equivalent. We state and prove
properties of borders and describe the Generalized Border Algorithm; whereas the
algorithm reads, at a high level, exactly as in previous references, its validation is
new, as previous ones depended on the lattice being an FCA lattice. In Section 4
we introduce the Generalized iPred Algorithm, exporting the iPred algorithm
of FCA lattices [5] to arbitrary lattices, after arguing its correctness. This task
is far from trivial and is our major contribution, since the existing rendering and
validation of the iPred algorithm relies again extensively on the fact that it is
being applied to an FCA lattice, and even performs operations on difference sets
that may not belong to the closure space. Concluding remarks and future work
ideas are presented in Section 5.

2 Preliminaries 

We develop all our work in terms of lattices and semilattices; see [7] as main
source. All our structures are finite. A lattice is a partially ordered set in which
every nonempty subset has a meet (greatest lower bound) and a join (least
upper bound). If only one of these two operations is guaranteed to be available a
priori, we speak of a join-semilattice or a meet-semilattice as convenient. Top and
bottom elements are denoted $\top$ and $\bot$, respectively. Lower case letters, possibly
with primes, and taken usually from the end of the Latin alphabet denote lattice
elements: $x$, $y'$. Note that Galois connections are not explicitly present in this
paper, so that the "prime" notation does not refer to the operations of Galois
connections.

Finite semilattices can be extended into lattices by addition of at most one
further element [7]; for instance, if $(\mathcal{L}, \le, \vee)$ is a join-semilattice with bottom
element $\bot$, one can define a meet operation as follows:
$\bigwedge X = \bigvee \{ y \mid \forall x \in X,\ y \le x \}$;
the element $\bot$ ensures that this set is nonempty. Thus, if the join-semilattice
lacks a bottom element, it suffices to add an "artificial" one to obtain a lattice.
A dual process is obviously possible in meet-semilattices.
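
As an illustration of this construction, the following minimal Python sketch computes a meet as the join of all common lower bounds. The divisor lattice of 12 (ordered by divisibility, with the least common multiple as join) and the names `meet_from_join`, `divides` and `lcm` are our own illustrative assumptions; they do not appear in the development above.

```python
from functools import reduce
from math import gcd

DIVISORS = [1, 2, 3, 4, 6, 12]  # illustrative carrier set (assumption)


def divides(a, b):
    """Order of the example: a <= b iff a divides b."""
    return b % a == 0


def lcm(a, b):
    """Join of the example: least common multiple."""
    return a * b // gcd(a, b)


def meet_from_join(elements, leq, join, bottom, subset):
    """Meet of `subset`, obtained as the join of all its common lower bounds."""
    lower_bounds = [y for y in elements if all(leq(y, x) for x in subset)]
    # `bottom` is always a lower bound, so the join below is never empty.
    return reduce(join, lower_bounds, bottom)


print(meet_from_join(DIVISORS, divides, lcm, 1, [4, 6]))
```

On this example the computed meet coincides with the greatest common divisor, which is the meet of the divisibility order.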

Given two join-semilattices $(S, \vee)$ and $(T, \vee)$, a homomorphism is a function
$f : S \to T$ such that $f(x \vee y) = f(x) \vee f(y)$. Hence $f$ is just a homomorphism of
the two semigroups associated with the two semilattices. If $S$ and $T$ both include
a bottom element $\bot$, then $f$ should also be a monoid homomorphism, i.e. we
additionally require that $f(\bot) = \bot$. Homomorphisms of meet-semilattices and
of lattices are defined similarly. It is easy to check that $x \le y \Rightarrow f(x) \le f(y)$ for
any homomorphism $f$; the converse implication, thus the equivalence
$x \le y \Leftrightarrow f(x) \le f(y)$, is also true for injective $f$ but not guaranteed in general.

We must point out here a simple but crucial fact that plays a role in our
later developments: given a homomorphism $f$ between two join-semilattices $S$
and $T$, if we extend both into lattices as just indicated, then $f$ is not necessarily
a lattice homomorphism; for instance, there could be elements of $T$ that do not
belong to the image set of $f$, and they may become meets of subsets of $T$ in a
way that prevents them from being the image of the corresponding meet of $S$. For one
specific example, see Figure 1: consider the two join-semilattices defined by the
solid lines, where the numbering defines an injective homomorphism from the
join-semilattice in (a) to the join-semilattice in (b). Both lack a bottom element.
Upon adding it, as indicated by the broken lines, in lattice (a) the meets of 1
and 2 and of 1 and 3 coincide, but the meets of their corresponding images in
(b) do not; for this reason, the homomorphism cannot be extended to the whole
lattices.




(a) (b) 
Fig. 1. Two join-semilattices converted into lattices 



However, the following does hold: 

Lemma 1. Consider two join-semilattices $S$ and $T$, and let $f : S \to T$ be a
homomorphism. After extending both semilattices into lattices,
$f(\bigwedge Y) \le \bigwedge f(Y)$ for all $Y \subseteq S$.

This is immediate to see by considering that $\bigwedge Y \le y$ for all $y \in Y$, hence
$f(\bigwedge Y) \le f(y)$ for all such $y$, and the claimed inequality follows.

We employ $x < y$ as the usual shorthand: $x \le y$ and $x \ne y$. We denote as
$x \prec y$ the fact that $x$ is an immediate predecessor of $y$ in $\mathcal{L}$, that is, $x < y$ and,
for all $z$, $x < z \le y$ implies $z = y$ (equivalently, $x \le z < y$ implies $x = z$).

We focus on algorithms that have access to an underlying finite lattice $\mathcal{L}$ of
size $|\mathcal{L}| = n$, with ordering denoted $\le$; abusing language slightly, we denote by
$\mathcal{L}$ as well its carrier set. The width $w(\mathcal{L})$ of the lattice $\mathcal{L}$ is the maximum size
of an antichain (a subset of $\mathcal{L}$ formed by pairwise incomparable elements). The
lattice is assumed to be available for our algorithms in the form of an abstract
data type offering an iterator that traverses all the elements of the carrier set,
together with the operations of testing for the ordering (given $x, y \in \mathcal{L}$, find out
whether $x \le y$) and computing the meet $x \wedge y$ and join $x \vee y$ of $x, y \in \mathcal{L}$; also
the constants $\top \in \mathcal{L}$ and $\bot \in \mathcal{L}$ are assumed available.

The algorithms we consider are to perform the task of constructing explicitly
the Hasse diagram (also known as the reflexive and transitive reduction) of the
given lattice: $H(\mathcal{L}) = \{(x, y) \mid x \prec y\}$. By projecting the Hasse diagram along
the first or the second component we find our crucial ingredients: the well-known
upper and lower covers.

Definition 1. The upper cover of $x \in \mathcal{L}$ is $\mathrm{uc}(x) = \{y \mid x \prec y\}$. The lower
cover of $y \in \mathcal{L}$ is $\mathrm{lc}(y) = \{x \mid x \prec y\}$.
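
To make these definitions concrete, here is a brute-force Python sketch that computes covers and the Hasse diagram directly from the ordering. It is only a reference point for small examples, not one of the algorithms studied in this paper; the divisor lattice of 12 and all function names are illustrative assumptions.

```python
DIVISORS = [1, 2, 3, 4, 6, 12]  # illustrative carrier set (assumption)


def divides(a, b):
    """Order of the example: a <= b iff a divides b."""
    return b % a == 0


def upper_cover(elements, leq, x):
    """uc(x): the minimal elements among those strictly above x."""
    above = [y for y in elements if y != x and leq(x, y)]
    return {y for y in above if not any(z != y and leq(z, y) for z in above)}


def lower_cover(elements, leq, y):
    """lc(y): the maximal elements among those strictly below y."""
    below = [x for x in elements if x != y and leq(x, y)]
    return {x for x in below if not any(z != x and leq(x, z) for z in below)}


def hasse_diagram(elements, leq):
    """H(L): all pairs (x, y) where x is an immediate predecessor of y."""
    return {(x, y) for x in elements for y in upper_cover(elements, leq, x)}


print(sorted(hasse_diagram(DIVISORS, divides)))
```

Each cover costs a quadratic number of comparisons here, which is the kind of repeated traversal that border-based algorithms aim to avoid.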

The following immediate fact is stated separately just for purposes of easy 
later reference: 

Proposition 1. If $x < y$ then there is $z \in \mathrm{uc}(x)$ such that $x \prec z \le y$; and there
is $z' \in \mathrm{lc}(y)$ such that $x \le z' \prec y$.

We will use as well yet another easy technicality: 

Lemma 2. If $x_1 \prec y$ and $x_2 \prec y$, with $x_1 \ne x_2$, then $x_1 \vee x_2 = y$.

Proof. Since $y \ge x_1$ and $y \ge x_2$ we have $y \ge x_1 \vee x_2$. Then, $x_1 \ne x_2$ implies that
they are mutually incomparable, since otherwise the smallest is not an immediate
predecessor of $y$; this implies that $y \ge x_1 \vee x_2 > x_1$, whence $y = x_1 \vee x_2$ as $x_1 \prec y$.

□

3 The Border Algorithm in Lattices 

The algorithms we are considering here have in common the fact that they
traverse the lattice and explicitly maintain a subset of the elements seen so far:
those that still might be used to identify new Hasse edges. This subset is known
as the "border" and, as it evolves during the traversal, actually each element
$x \in \mathcal{L}$ "gets its own border" associated as the algorithm reaches it. The border
associated to an element may be potentially used to construct new edges touching
it (although these edges may not touch the border elements themselves): more
precisely, operations on the border for $x$ will result in $\mathrm{uc}(x)$, hence in the Hasse
edges of the form $(x, z)$.

In previous references the border is defined in terms of the elements already
processed, and its properties are mixed with those of the algorithm that uses
it. Instead, we study axiomatically the properties of the notion of "border" in
itself, always as a function of the element for which the border will be considered
as a source of Hasse edges, in a manner that is independent of the fact that
one is traversing the lattice. This allows us to clarify which abstract properties
are necessary for border-based algorithms, so that we can generalize them to
arbitrary lattices, traversed in flexible ways. Our key definition is, therefore:

Definition 2. Given $x \in \mathcal{L}$ and $B \subseteq \mathcal{L}$, $B$ is a border for $x$ if the following
properties hold:

1. $\forall y \in B\ (y \not\le x)$;

2. $\forall z\ (x \prec z \Rightarrow \exists y \in B\ (y \le z))$.

That is, $x$ is never above an element of a border, but each upper cover of $x$
is; this last condition is equivalent to: all elements strictly above $x$ are greater
than or equal to some element of the border. Since $x \le (x \vee y)$ always holds and
$x = (x \vee y)$ if and only if $y \le x$, we get:

Lemma 3. Let $B$ be a border for $x$. Then $\forall y \in B\ (x < x \vee y)$.

All our borders will fulfill an extra "antichain" condition; the only use to be 
made of this fact is to bound the size of every border by the width of the lattice. 

Definition 3. A border B is proper if every two different elements of B are 
mutually incomparable. 

The key property of borders, that shows how to extract Hasse edges from 
them, is the following: 

Theorem 1. Let $B$ be a border for $x_0$. For all $x_1$ with $x_0 < x_1$, the following
are equivalent:

1. $x_1 \in \mathrm{uc}(x_0)$ (that is, $x_0 \prec x_1$);

2. there is $y \in B$ such that $x_1 = (x_0 \vee y)$ and, for all $z \in B$, if $(x_0 \vee z) \le (x_0 \vee y)$
then $(x_0 \vee z) = (x_0 \vee y)$.

Proof. Given $x_0 \prec x_1$, we can apply the second condition in the definition of
border for $x_0$: $\exists y \in B\ (y \le x_1)$. Using Lemma 3, $x_0 < (x_0 \vee y) \le x_1$, implying
$(x_0 \vee y) = x_1$ since $x_0 \prec x_1$. Additionally, assuming $(x_0 \vee z) \le (x_0 \vee y)$ for some
$z \in B$ leads likewise to $x_0 < (x_0 \vee z) \le (x_0 \vee y) = x_1$ and the same property
applies to obtain $(x_0 \vee z) = (x_0 \vee y) = x_1$.

Conversely, again Lemma 3 gives $x_0 < (x_0 \vee y) = x_1$. By Proposition 1
there is $z_0 \in \mathrm{uc}(x_0)$ with $x_0 \prec z_0 \le (x_0 \vee y) = x_1$. We apply the second
condition of borders to $x_0 \prec z_0$ to obtain $z_1 \in B$ with $z_1 \le z_0$, whence
$(x_0 \vee z_1) \le z_0 \le (x_0 \vee y) = x_1$, allowing us to apply the hypothesis of this direction:
$(x_0 \vee z_1) \le (x_0 \vee y)$ with $z_1 \in B$ implies $(x_0 \vee z_1) = (x_0 \vee y)$ and, therefore,
$(x_0 \vee z_1) = z_0 = (x_0 \vee y) = x_1$. That is, $x_1 = z_0 \in \mathrm{uc}(x_0)$. □

Therefore, given an arbitrary element $x_0$ of the lattice, any candidate for
being an element of its upper cover has to be obtainable as a join between $x_0$
and a border element ($x_1 = x_0 \vee y$ for some $y \in B$). Moreover, among these
candidates, only those that are minimal represent immediate successors: they
come from those $y$ where $(x_0 \vee z) \le (x_0 \vee y)$ implies $(x_0 \vee z) = (x_0 \vee y)$, for all
$z \in B$.
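
The following Python sketch illustrates Theorem 1 on a small case: it checks the two border conditions by brute force and then extracts the upper cover as the minimal joins with border elements. The divisor lattice of 12, the chosen border {3, 4, 12} for the element 2, and the function names are illustrative assumptions.

```python
from math import gcd

DIVISORS = [1, 2, 3, 4, 6, 12]  # illustrative carrier set (assumption)


def divides(a, b):
    """Order of the example: a <= b iff a divides b."""
    return b % a == 0


def lcm(a, b):
    """Join of the example: least common multiple."""
    return a * b // gcd(a, b)


def is_border(x, border, elements, leq):
    """Check Definition 2, using the equivalent form of its second condition."""
    none_below_x = all(not leq(y, x) for y in border)
    strictly_above = [z for z in elements if z != x and leq(x, z)]
    all_dominated = all(any(leq(y, z) for y in border) for z in strictly_above)
    return none_below_x and all_dominated


def cover_from_border(x, border, leq, join):
    """Upper cover of x: the minimal elements among the joins of x with the border."""
    candidates = {join(x, y) for y in border}
    return {c for c in candidates
            if not any(d != c and leq(d, c) for d in candidates)}


border = {3, 4, 12}
print(is_border(2, border, DIVISORS, divides))
print(cover_from_border(2, border, divides, lcm))
```

Here the candidate 12 (the join of 2 and 12) is discarded because the candidate 4 lies below it, so only the true upper cover of 2 survives.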

3.1 Advancing Borders

There is a naturally intuitive operation on borders; if we have a border B for x, 
and we use it to compute the upper cover of x, then we do not need B as such 
anymore; to update it, seeing that we no longer need to forbid the membership of 
x, it is natural to consider adding x to the border. If we had a proper border, and 
we wished to preserve the antichain property, the elements to be removed would 
be exactly the upper cover just computed, as these are, as we argue below, the 
only elements comparable to x that could be in a proper border. (All elements 
other than x are mutually incomparable, as the border was proper to start with.) 

Definition 4. Given $x \in \mathcal{L}$ and a border $B$ for $x$, the standard step for $B$ and
$x$ is $B \cup \{x\} - \mathrm{uc}(x)$.

Note that this is not to say that $\mathrm{uc}(x) \subseteq B$; elements of $\mathrm{uc}(x)$ may or may
not appear in $B$. We will apply the standard step always when $B$ is a border
for $x$, but let us point out that the definition would be also valid without this
constraint, as it consists of just some set-theoretic operations.

Proposition 2. Let B be a proper border for x. Then the standard step for B 
and x is also an antichain. 

Proof. Elements of the standard step different from $x$ and from all elements of
$\mathrm{uc}(x)$ were already in the previous proper border and are, therefore, mutually
incomparable. None of them is below $x$, by the first border property. If $y > x$
for some $y \in B$, then, by Proposition 1, $y \ge z \succ x$ for some $z \in \mathrm{uc}(x)$; the second
border property gives $y' \in B$ with $y' \le z \le y$, and the antichain property of
$B$ tells us that $y' = y = z$, so that $y$ gets removed with $\mathrm{uc}(x)$. □

However, we are left with the problem that we have now a candidate border 
but we lack the lattice element for which it is intended to be a border. In [14]
and [5], the algorithm moves on to an intent set of the same cardinality as x, 
whenever possible, and to as small as possible a larger intent set if all intents of 
the same cardinality are exhausted. In [20] it is shown that, for their variant of 
the Border algorithm, it suffices to follow a (reversed) linear embedding of the 
lattice. Here we follow this more flexible approach, which is easier now that we 
have stated the necessary properties of borders with no reference to the order of 
traversal: there is no need of considering intent sets and their cardinalities. 

Both lattices and their Hasse diagrams can be seen as directed acyclic graphs,
by orienting the inequalities in either direction; here we choose to visualize edges
$(x, y)$ as corresponding to $x < y$. A linear embedding corresponds to the
well-known operation of topological sort of directed acyclic graphs, which we will
employ for lattices in a "reversed" way:

Definition 5. A reverse topological sort of $\mathcal{L}$ is a total ordering $x_1, \ldots, x_n$ of $\mathcal{L}$
such that $x_i < x_j$ always implies $j < i$.
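
One simple way to obtain such an ordering when only the comparison test is available is sketched below in Python: elements are sorted by decreasing size of their down-sets, which works because $x < y$ forces the down-set of $x$ to be a proper subset of that of $y$. This ranking is our own illustrative choice, costs a quadratic number of comparisons, and is not claimed to be the sort used in any of the cited implementations; the divisor lattice of 12 is likewise an assumed example.

```python
def divides(a, b):
    """Order of the example: a <= b iff a divides b."""
    return b % a == 0


def reverse_topological_sort(elements, leq):
    """Order the elements so that larger elements always come first."""
    elements = list(elements)
    # If x < y, every element below x is also below y, and y itself is not
    # below x, so the down-set of y is strictly larger than that of x.
    rank = {x: sum(1 for y in elements if leq(y, x)) for x in elements}
    return sorted(elements, key=lambda x: rank[x], reverse=True)


order = reverse_topological_sort([1, 2, 3, 4, 6, 12], divides)
print(order)
```

The top element always comes first and the bottom element last; incomparable elements may appear in either relative order.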

All our development could be performed with a standard topological sort,
not reversed, that is, a linear embedding of the lattice's partial order. However,
as it is customary in FCA to guide the visualization through the comparison
of extents, the algorithms we build on were developed with a sort of "built-in
reversal" that we inherit through reversing the topological sort (see the similar
discussion in Section 2.1 of [5]). A reversed topological sort must start with $\top$,
hence the initialization is easy:

Proposition 3. $B = \emptyset$ is a border for $\top \in \mathcal{L}$.

Proof. Both conditions in the definition of border become vacuously true: the
first one as $B = \emptyset$ and the second one as the top element has no upper covers. □

Theorem 2. Let $x_1, \ldots, x_n$ be a reverse topological sort of $\mathcal{L}$. Starting with
$B_1 = \emptyset$, define inductively $B_{k+1}$ as the standard step for $B_k$ and $x_k$. Then, for
each $k$, $B_k$ is a border for $x_k$.

For clarity, we factor off the proof of the following inductive technical fact, 
where we use the same notation as in the previous statement. 

Lemma 4. $B_k \subseteq \{x_1, \ldots, x_{k-1}\}$ and, for all $x_j$ with $j < k$, there is $y \in B_k$
with $y \le x_j$.

Proof. For $k = 1$, the statements are vacuously true. Assume it true for $k$,
and consider $B_{k+1} = B_k \cup \{x_k\} - \mathrm{uc}(x_k)$, the standard step for $B_k$ and $x_k$.
The first statement is clearly true. For the second, $x_k$ is itself in $B_{k+1}$ and, for
the rest, inductively, there is $y \in B_k$ with $y \le x_j$. We consider two cases; if
$y \notin \mathrm{uc}(x_k)$, then the same $y$ remains in $B_{k+1}$; otherwise, $x_k \prec y \le x_j$, and $x_k$ is
the corresponding new $y$ in $B_{k+1}$. □

Proof (of Theorem 2). Again by induction on $k$; we see that the basis is
Proposition 3. Assuming that $B_k$ is a border for $x_k$, we consider
$B_{k+1} = B_k \cup \{x_k\} - \mathrm{uc}(x_k)$. Applying the lemma, $B_{k+1} \subseteq \{x_1, \ldots, x_k\}$, which ensures immediately
that $\forall y \in B_{k+1}\ (y \not\le x_{k+1})$ by the property of the reverse topological sort, and
the first condition of borders follows. For the second, pick any $z \in \mathrm{uc}(x_{k+1})$;
by the condition of reverse topological sort, $z$, being a strictly larger element
than $x_{k+1}$, must appear earlier than it, so that $z = x_j$ with $j < k + 1$. Then,
again the lemma tells us immediately that there is $y \in B_{k+1}$ with $y \le x_j = z$,
as we need to complete the proof. □

3.2 The Generalized Border Algorithm 

The algorithm we end up validating through our theorems has almost the same 
high-level description as the rendering in [5]; the most conspicuous differences 
are: first, that a reverse topological sort is used to initialize the traversal of the 
lattice; and, second, that the "reversed lattice" model in [5] has the consequence
that their set-theoretic intersection in computing candidates becomes a lattice 



Border Algorithms for Computing Hasse Diagrams of Arbitrary Lattices 






join in our generalization. Another minor difference is that Proposition 3 spares
us the separate handling of the first element of the lattice. 

RevTopSort(ℒ);
B = ∅;
H = ∅;
for x in ℒ, according to the sort do
    candidates = {x ∨ y | y ∈ B};
    cover = minimals(candidates);
    for z in cover do add (x, z) to H;
    B = B ∪ {x} − cover;
end

Algorithm 1: The Generalized Border Algorithm

Theorem 2 and Proposition 3 tell us that the following invariant is
maintained: $B$ is a border for $x$. Then, the Hasse edges are computed and added
to $H$ according to Theorem 1 in two steps: first, we prepare the list of joins
$x \vee y$ and, then, we keep only the minimal elements in it. In essence, this process
is the same as described (in somewhat different renderings) in [5], [14] or [20];
however, while the definition of border given in [20] (and recalled in [5]) leads,
eventually, to the same notion employed in this paper, further development of
a general algorithm that works outside the formal concept analysis framework
is dropped from [20] on efficiency considerations. Moreover, the border
algorithm described in [14] works exclusively on the set of intents and assumes the
elements are sorted sizewise. The validations of the algorithms in these references
rely very much, at some points, on the fact that the lattice is a sublattice of a
powerset and contains formal concepts, explicitly operating set-theoretically on
their intents. Theorem 1 captures the essence of the notion of border and lifts
the algorithm to arbitrary lattices.
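
A direct Python transcription of Algorithm 1 may help to follow the invariant. It is a minimal sketch under stated assumptions: the lattice is passed as an already sorted list together with its order test and join, the example is the divisor lattice of 12 (where decreasing numeric order happens to be a reverse topological sort), and all names are hypothetical.

```python
from math import gcd


def divides(a, b):
    """Order of the example: a <= b iff a divides b."""
    return b % a == 0


def lcm(a, b):
    """Join of the example: least common multiple."""
    return a * b // gcd(a, b)


def minimals(candidates, leq):
    """Elements of `candidates` with no other candidate below them."""
    return {c for c in candidates
            if not any(d != c and leq(d, c) for d in candidates)}


def generalized_border(order, leq, join):
    """Hasse edges (x, z), with x an immediate predecessor of z.

    `order` must be a reverse topological sort of the lattice.
    """
    border, hasse = set(), set()
    for x in order:
        candidates = {join(x, y) for y in border}
        cover = minimals(candidates, leq)
        hasse.update((x, z) for z in cover)
        border = (border | {x}) - cover  # the standard step for border and x
    return hasse


# A proper divisor is numerically smaller, so descending order is valid here.
order = sorted([1, 2, 3, 4, 6, 12], reverse=True)
print(sorted(generalized_border(order, divides, lcm)))
```

For instance, when the element 3 is reached the border is {4, 6}; the candidates are 6 and 12, and only 6 is minimal, so the single edge (3, 6) is added.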

One additional difference comes from the fact that the cost of computing
the meet and join operations plays a role in the complexity analysis, but is
not available in the general case. If we assume that meet and join operations
take constant time, then the total running time of the algorithm (except for the
sort initialization, which takes $O(|\mathcal{L}| \log |\mathcal{L}|)$) is bounded by $O(|\mathcal{L}|\, w(\mathcal{L})^2)$. By
comparison with [20], one can see that one factor of the formula given in [20]
gets dropped under the constant time assumption for computing meet and join.
However, this assumption may be unreasonable in certain applications; the same
reference indicates that their FCA target case requires a considerable amount
of graph search for the same operations. Nevertheless, in absence of further
information about the specific lattice at hand, it is not possible to provide a
finer analysis.

We must point out that, in our implementation, we have employed a heapsort-based
version that keeps providing us the next element to handle by means of
an iterator, instead of completing the sorting step for the initialization.

4 Distributivity and the iPred Algorithm

In [5], an extra sophistication is introduced that, as demonstrated both formally
in the complexity analysis of the algorithm and also practically, leads to a faster
algorithm; namely, if some further information is maintained along the way, once the
candidates are available there is a constant-time test to pick those that are in
the cover, by employing the duality $y \in \mathrm{uc}(x) \Leftrightarrow x \in \mathrm{lc}(y) \Leftrightarrow x \prec y$. Constant
time also suffices to maintain the additional information. This gives the iPred
algorithm. However, it seems that the unavoidable price is to work on formal
concepts, as the extra information is heavily set-theoretic (namely, a union of
set differences of previously found cover sets for the candidate under study).

Again we show that a fully abstract, lattice-theoretic interpretation exists, 
and we show that the essential property that allows for the algorithm to work is 
distributivity: be it due to a distributive $\mathcal{L}$, or, as in fact happens in iPred, due
to the embedding of the lattice into a distributive lattice, in the same way as 
concept lattices (possibly nondistributive) can be embedded in the distributive 
powerset lattice. 

We start treating the simplest case, of very limited usefulness in itself but 
good as stepping stone towards the next theorem. The property where distribu- 
tivity can be applied later, if available, is as follows: 

Proposition 4. Consider two comparable elements, $x < z$, from $\mathcal{L}$; let
$Y \subseteq \mathrm{lc}(z)$ be the set of lower covers of $z$ that show up in the reverse topological sort
before $x$ (it could be empty). Then, $x \in \mathrm{lc}(z)$ if and only if $\bigwedge_{y \in Y} (x \vee y) \ge z$.

Proof. Applying Proposition 1 we know that there is some $y \in \mathrm{lc}(z)$ such that
$x \le y \prec z$. Any such $y$, if different from $x$, must appear before $x$ in the reverse
topological sort.

Suppose first that no lower covers of $z$ appear before $x$, that is, $Y = \emptyset$.
Then, no such $y$ different from $x$ can exist; we have that both $x = y \prec z$ and
$\bigwedge_{y \in Y} (x \vee y) = \top \ge z$ trivially hold.

In case $Y$ is nonempty, assume first $x \prec z$; we can apply Lemma 2: $x \vee y = z$
for every $y \in Y$, hence $\bigwedge_{y \in Y} (x \vee y) = z$. To argue the converse, assume $x \notin \mathrm{lc}(z)$
and let $x \le y' \prec z$ as before, where we know further that $x \ne y'$: then $y' \in Y$,
so that $\bigwedge_{y \in Y} (x \vee y) \le (x \vee y') = y' < z$. □

This means that the test for minimality of Algorithm 1 can be replaced by
checking the indicated inequality; but it is unclear that we really save time, as 
a number of joins have to be performed (between the current element x and all 
the elements in the lower cover of the candidate z that appeared before x in 
the reverse topological sort) and the meet of their results computed. However, 
clearly, in distributive lattices the test can be rephrased in the following, more 
convenient form: 

Proposition 5. Assume $\mathcal{L}$ distributive. In the same conditions as in the
previous proposition, $x$ is in the lower cover of $z$ if and only if $x \vee (\bigwedge_{y \in Y} y) \ge z$.

This last version of the test is algorithmically useful: as we keep identifying
elements $Y = \{y_1, \ldots, y_m\}$ of $\mathrm{lc}(z)$, we can maintain the value of
$y = \bigwedge_{i \in \{1, \ldots, m\}} y_i$; then, we can test a candidate $z$ by computing $x \vee y$ and
comparing this value to $z$. Afterwards, we update $y$ to $y \wedge x$ if $x = y_{m+1}$ is indeed
in the cover. This may save the loop that tests for minimality at a small price.

However, unfortunately, if the lattice is not distributive, this faster test may
fail: given $Y \subseteq \mathrm{lc}(z)$, the cover elements found so far along the reverse topological
sort, it is always true that $x$ is in the lower cover of $z$ if $x \vee (\bigwedge_{y \in Y} y) \ge z$, because
$z \le x \vee (\bigwedge_{y \in Y} y) \le \bigwedge_{y \in Y} (x \vee y)$ and, then, one of the directions of Proposition 4
applies; but the converse does not hold in general. Again an example is furnished
by Figure 1(a), one of the basic, standard examples of a small nondistributive
lattice; assume that the traversal follows the natural ordering of the labels, and
consider what happens after seeing that 1 and 2 are indeed lower covers of $z = \top$.
Upon considering $x = 3$, we have $Y = \{1, 2\}$, so that $x \vee (\bigwedge Y) = x \vee \bot = x < z$,
yet $x$ is a lower cover of $z$ and, in fact, $\bigwedge_{y \in Y} (x \vee y) = (3 \vee 1) \wedge (3 \vee 2) = \top$.
Hence, the distributivity condition is necessary for the correctness of the faster
test.
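
The computation in this counterexample can be replayed mechanically. The Python sketch below assumes, as the text suggests, that lattice (a) of Figure 1 is the diamond with three pairwise incomparable elements 1, 2, 3 between bottom and top; the encoding of the lattice and the function names are our own.

```python
BOT, TOP = "bot", "top"  # labels for the bottom and top elements (assumption)


def leq(a, b):
    """Order of the diamond: 1, 2, 3 are pairwise incomparable."""
    return a == b or a == BOT or b == TOP


def join(a, b):
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP  # two distinct middle elements


def meet(a, b):
    if leq(a, b):
        return a
    if leq(b, a):
        return b
    return BOT  # two distinct middle elements


x, z = 3, TOP
# Test of Proposition 4: meet of the joins of x with each known lower cover.
slow_test_value = meet(join(x, 1), join(x, 2))
# Faster test of Proposition 5: join of x with the meet of the known covers.
fast_test_value = join(x, meet(1, 2))

print(leq(z, slow_test_value))  # the test of Proposition 4 accepts x
print(leq(z, fast_test_value))  # the faster test wrongly rejects x
```

The two values differ precisely because the diamond is not distributive, which is what makes the faster test unsound in this lattice.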

4.1 The Generalized iPred Algorithm 

The aim of this subsection is to show the main contribution of this paper: we can
spare the loop that tests candidates for minimality in an indirect way, whenever
a distributive lattice is available where we can embed $\mathcal{L}$. However, we must
be careful in how the embedding is performed: the right tool is an injective
homomorphism of join-semilattices. Recall that, often, this will not be a lattice
morphism. Such an example is the identity morphism having as domain the
carrier set of a concept lattice $\mathcal{L}$ over the set of attributes $X$, and as range,
$\mathcal{P}(X)$ (see Section 5 for more details on this particular case).

Theorem 3. Let $(\mathcal{L}', \le, \vee)$ be a distributive join-semilattice and $f : \mathcal{L} \to \mathcal{L}'$ an
injective homomorphism. Consider two comparable elements, $x < z$, from $\mathcal{L}$; let
$Y \subseteq \mathrm{lc}(z)$ be the set of lower covers of $z$ that show up in the reverse topological
sort before $x$. Then, $x \prec z$ if and only if $f(x) \vee (\bigwedge_{y \in Y} f(y)) \ge f(z)$.

Proof. If $Y = \emptyset$ we have $x \prec z$ as in Proposition 4; for this case, $\bigwedge_{y \in Y} f(y) = \top$
(of $\mathcal{L}'$) and $f(x) \vee (\bigwedge_{y \in Y} f(y)) = f(x) \vee \top = \top \ge f(z)$.

For the case where $Y \ne \emptyset$, assume first $x \prec z$ and apply Proposition 4:
we have that $\bigwedge_{y \in Y} (x \vee y) \ge z$ whence $f(\bigwedge_{y \in Y} (x \vee y)) \ge f(z)$. By Lemma 1
we obtain $f(z) \le f(\bigwedge_{y \in Y} (x \vee y)) \le \bigwedge_{y \in Y} f(x \vee y) = \bigwedge_{y \in Y} (f(x) \vee f(y)) =
f(x) \vee \bigwedge_{y \in Y} f(y)$, where we have applied that $f$ commutes with join and that
$\mathcal{L}'$ is distributive.

For the converse, arguing along the same lines as in Proposition 4, assume
$x \notin \mathrm{lc}(z)$ and let $x \le y' \prec z$ with $x \ne y'$ so that $y' \in Y$: necessarily $\bigwedge_{y \in Y} f(y) \le
f(y')$, so that $f(x) \vee (\bigwedge_{y \in Y} f(y)) \le f(x) \vee f(y') = f(x \vee y') = f(y') < f(z)$,
where the last step makes use of injectiveness. □



Jose L Balcazar, Cristina Tirnauca



The generalized iPred algorithm is based on this theorem, which proves it
correct. In it, the homomorphism $f$ is assumed available, and table LC keeps,
for each $z$, the meet of the $f(x)$'s for all the lower covers $x$ of $z$ seen so far.

RevTopSort(ℒ);
B = ∅;
H = ∅;
for x in ℒ, according to the sort do
    LC[x] = ⊤;
    candidates = {x ∨ y | y ∈ B};
    for z in candidates do
        if f(x) ∨ LC[z] ≥ f(z) then
            add (x, z) to H;
            LC[z] = LC[z] ∧ f(x);
            B = B − {z};
        end
    end
    B = B ∪ {x};
end

Algorithm 2: The Generalized iPred Algorithm

In the Appendix below, we provide some example runs for further clarification.
Regarding the time complexity, again we lack information about the cost of
meets, joins, and comparisons in both lattices, and also about the cost of
computing the homomorphism. Assuming constant time for these operations, the
running time of the generalized iPred algorithm is $O(|\mathcal{L}|\, w(\mathcal{L}))$ (plus sorting):
the main loop (lines 4-15) is repeated $|\mathcal{L}|$ times, and then for each of the at most
$w(\mathcal{L})$ candidates, the algorithm checks if a certain condition is met (in constant
time) and updates the diagram and the border in the positive case.

If meets and joins do not take constant time, there is little to say at this level 
of generality; however, for the particular case of the original iPred, which only 
works for lattices of formal concepts, see [5]: in the running time analysis there,
one extra factor appears since the meet operation (corresponding to a set union 
plus a closure operation) is not guaranteed to work in constant time. 

5 Conclusions and Future Work 

We have provided a formal framework for the task of computing Hasse diagrams
of arbitrary lattices through the notion of "border associated with a lattice
element". Although the concept of border itself is not new, our approach provides
a different, more "axiomatic" point of view that facilitates considerably the 
application of this notion to algorithms that construct Hasse diagrams outside 
the formal concept analysis world. 

While Algorithm 1 is a clear, straightforward generalization of the Border
algorithm of [2015] (although the correctness proof is far less straightforward),
we consider that we should explain further in what sense the iPred algorithm
comes out as a particular case of Algorithm 2. In fact, the iPred algorithm uses
set-theoretic operations and, therefore, is operating with sets that do not belong
to the closure space: effectively, it has moved out of the concept lattice into the
(distributive) powerset lattice. Starting from a concept lattice $(\mathcal{L}, \leq, \vee, \wedge)$ on a
set $X$ of attributes, we can define:

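- $x \leq y :\Leftrightarrow x \supseteq y$
- $x \vee y := x \cap y$
- $x \wedge y := \bigvee \{z \in \mathcal{L} \mid z \leq x, z \leq y\} = \bigcap \{z \in \mathcal{L} \mid z \supseteq x, z \supseteq y\}$
- $\top := \emptyset$, $\bot := X$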

Thus, $\mathcal{L}$ is a join-subsemilattice of the (reversed) powerset on $X$, and we can
define $f : \mathcal{L} \to \mathcal{P}(X)$ as the identity function: it is injective, and it is a
join-homomorphism since $\mathcal{L}$, being a concept lattice, is closed under set-theoretic
intersection. Therefore, Theorem 3 can be translated to: $x \in lc(z)$ if and only if
$x \cap (\bigcup_{y \in Y} y) = z$, where $Y$ is the set of lower covers of $z$ already found; this is
fully equivalent to the condition behind algorithm iPred of [5] (see Proposition
1 on page 169 in [5]). Additionally, iPred works on one specific topological sort,
where all intents of the same cardinality appear together; our generalization
shows that this is not necessary: any linear embedding suffices.
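
In this set-based reading the generalized algorithm needs only intersections, unions and inclusion tests. The sketch below is an illustration under assumptions made here, not the code of [5]: the function name `hasse_of_intents` and the toy family of intents are hypothetical, and the family is assumed to be closed under intersection. The test is written as an inclusion, which also covers the case where no lower cover of $z$ has been found yet; once $Y$ is nonempty it coincides with equality, because $z$ is contained in $x$ and in every element of $Y$.

```python
def hasse_of_intents(intents):
    """Hasse edges (x, z) of an intersection-closed family of attribute sets.

    The order is reversed inclusion, so (x, z) means that x is a lower
    cover of z, i.e. x is a minimal proper superset of z in the family.
    """
    family = {frozenset(s) for s in intents}
    # Smaller sets are larger in the reversed order, so they come first.
    order = sorted(family, key=lambda s: (len(s), sorted(s)))
    border, edges, lc = [], set(), {}
    for x in order:
        lc[x] = frozenset()                # top of the reversed powerset
        for z in {x & y for y in border}:  # join is set intersection
            if x & lc[z] <= z:             # f(x) v LC[z] >= f(z), read on sets
                edges.add((x, z))
                lc[z] = lc[z] | x          # meet is set union
                border = [b for b in border if b != z]
        border.append(x)
    return edges


# Hypothetical toy family on X = {a, b, c}; it is closed under intersection.
toy = ["", "a", "b", "ac", "abc"]
covers = hasse_of_intents(toy)
for x, z in sorted(covers, key=lambda e: (len(e[0]), sorted(e[0]))):
    print(sorted(x), "is a lower cover of", sorted(z))
```

The toy family is not distributive as a lattice (it is a pentagon), yet the powerset operations suffice, and sorting by cardinality is just one convenient linear extension.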

A further application we have in mind refers to various forms of implication
known as multivalued dependency clauses [17,18]: in [2,3,4], these clauses
are shown to be related to partition lattices in a similar way as implications
are related to concept lattices through the Guigues-Duquenne basis ([8,10]): further,
certain database dependencies (the degenerate multivalued dependencies of
[17,18]) are related to these clauses in the same way as functional dependencies
correspond to implications. Data Mining algorithms that extract multivalued
dependencies do exist [115] but we believe that alternative ones can be designed
using Hasse diagrams of the corresponding partition lattices or related structures
like split set lattices [2]. The task is not immediate, as functional and degenerate
multivalued dependencies are of the so-called "equality-generating" sort
but full-fledged multivalued dependencies are of the so-called "tuple-generating"
sort, and their connection to lattices is more sophisticated (see [2]); but we still
hope that further work along this lattice-theoretic approach to Hasse diagrams
would allow us to create a novel application to multivalued dependency mining.

6 Appendix 

We exemplify here some runs of iPred, for the sake of clarity. First we see how
it operates on the lattice in Figure 1(a), denoted $\mathcal{L}$ here, using as $f$ the injective
homomorphism into the distributive lattice of Figure 1(b) provided by the labels.
The run is reported in Table 1, where we can see that we identify the respective
upper covers of each of the lattice elements in turn. The linear order is assumed
to be $(\top, 1, 2, 3, \bot)$. Only the last loop has more than one candidate, in fact
three. The snapshots of the values of $B$, $H$, and $LC$ reported in each row (except
the initialization) are taken at the end of the corresponding loop, so that each
reported value of $B$ is a border for the next row. In the Hasse edges $H$, thin
lines represent edges that are yet to be found, and thick lines represent the edges
found so far. Recall that the values of $LC$ are actually elements of the distributive
lattice of Figure 1(b), and not from $\mathcal{L}$.



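$\mathcal{L}$ | $B$         | $H$       | cand        | $LC[\top]$ | $LC[1]$ | $LC[2]$ | $LC[3]$ | $LC[\bot]$
--------------+-------------+-----------+-------------+------------+---------+---------+---------+-----------
init          |             |           |             |            |         |         |         |
$\top$        | $\{\top\}$  |           |             | $\top$     |         |         |         |
1             | $\{1\}$     | (diagram) | $\{\top\}$  | 1          | $\top$  |         |         |
2             | $\{1,2\}$   | (diagram) | $\{\top\}$  | 4          | $\top$  | $\top$  |         |
3             | $\{1,2,3\}$ | (diagram) | $\{\top\}$  | $\bot$     | $\top$  | $\top$  | $\top$  |
$\bot$        |             | (diagram) | $\{1,2,3\}$ | $\bot$     | $\bot$  | $\bot$  | $\bot$  | $\top$

Table 1. Example run of the iPred algorithm using the lattices in Figure 1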



All along the run we can see that $LC[z]$ indeed maintains the meet of the
set of predecessors found so far for $f(z)$ in the distributive embedding lattice; of
course, this meet is $\top$ whenever the set is empty.
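
As a worked instance, the loop for $x = 3$ can be followed using only values that appear in Table 1. That the test succeeds is inferred here from the table itself, since $LC[\top]$ changes in that row; the value $3 \vee 4 = \top$ in the lattice of Figure 1(b) is therefore an inference rather than something read off the figure.

```latex
\begin{align*}
LC[\top] &= f(1) \wedge f(2) = 1 \wedge 2 = 4
  && \text{after the loops for 1 and 2} \\
f(3) \vee LC[\top] &= 3 \vee 4 = \top \geq f(\top)
  && \text{the test succeeds and $(3, \top)$ is added to $H$} \\
LC[\top] &\leftarrow LC[\top] \wedge f(3) = 4 \wedge 3 = \bot
  && \text{the entry reported in the row for 3}
\end{align*}
```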

Let us compare with the run on the distributive lattice in Figure 2, where
the homomorphism $f$ is now the identity. Observe that the only different Hasse
edge is the one above 3, which now goes to 2 instead of going to $\top$. Again the
linear sort follows the order of the labels.



Fig. 2. A distributive lattice



Due to the similarity among the Hasse diagrams, the run of generalized iPred
on this lattice starts exactly like the one already given, up to the point where
node 3 is being processed. At that point, 2 is a candidate and will indeed create
an edge, but 1 leads to candidate $1 \vee 3 = \top$, for which the test fails, as $LC[\top] = 4$
at that point, and $3 \vee 4 = 2 < \top$. Hence, this candidate has no effect. After
this, the visits to 4 and $\bot$ complete the Hasse diagram with their corresponding
upper covers.
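
The run just described can be replayed with a short script. The encoding below is an assumption made here, since the figure itself is not reproduced: the six elements are modelled as pairs in $\{0,1\} \times \{0,1,2\}$ with componentwise order, which is distributive and agrees with the relations quoted in the text ($1 \vee 3 = \top$, $3 \vee 4 = 2$, and $1 \wedge 2 = 4$). The names `trace_run`, `label` and `name` are hypothetical, and the string "B" stands for $\bot$.

```python
# Assumed encoding of the lattice of Fig. 2 as pairs with componentwise order.
label = {"T": (1, 2), "1": (0, 2), "2": (1, 1), "3": (1, 0), "4": (0, 1), "B": (0, 0)}
name = {pair: key for key, pair in label.items()}


def join(p, q):
    return (max(p[0], q[0]), max(p[1], q[1]))


def meet(p, q):
    return (min(p[0], q[0]), min(p[1], q[1]))


def leq(p, q):
    return p[0] <= q[0] and p[1] <= q[1]


def trace_run(order):
    """Generalized iPred with f the identity; records failed tests and borders."""
    border, edges, rejected, borders, lc = [], [], [], {}, {}
    for x in order:
        px = label[x]
        lc[x] = label["T"]
        candidates = sorted({name[join(px, label[y])] for y in border})
        for z in candidates:
            if leq(label[z], join(px, lc[z])):  # x v LC[z] >= z
                edges.append((x, z))
                lc[z] = meet(lc[z], px)
                border = [b for b in border if b != z]
            else:
                rejected.append((x, z))
        border.append(x)
        borders[x] = list(border)  # snapshot of B at the end of the loop for x
    return edges, rejected, borders


edges, rejected, borders = trace_run(["T", "1", "2", "3", "4", "B"])
print("edges:", edges)
print("failed tests:", rejected)
print("border after 3:", borders["3"], "after 4:", borders["4"])
```

Under this encoding the only failed test is the one for candidate $\top$ while processing 3, in line with the description above.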
(Table 2 body not recoverable; surviving entries: $B = \{1,3\}$, $B = \{3,4\}$, and the $LC$ values 4, $\bot$, $\bot$, $\bot$, $\bot$, $\bot$, $\top$.)

Table 2. Example run of the iPred algorithm on the lattice in Figure 2